theoretical model
Self-Interest and Systemic Benefits: Emergence of Collective Rationality in Mixed Autonomy Traffic Through Deep Reinforcement Learning
Chen, Di, Li, Jia, Zhang, Michael
Autonomous vehicles (AVs) are expected to be commercially available in the near future, leading to mixed autonomy traffic consisting of both AVs and human-driven vehicles (HVs). Although numerous studies have shown that AVs can be deployed to benefit the overall traffic system performance by incorporating system-level goals into their decision making, it is not clear whether the benefits still exist when agents act out of self-interest -- a trait common to all driving agents, both human and autonomous. This study aims to understand whether self-interested AVs can bring benefits to all driving agents in mixed autonomy traffic systems. The research is centered on the concept of collective rationality (CR). This concept, originating from game theory and behavioral economics, means that driving agents may cooperate collectively even when pursuing individual interests. Our recent research has proven the existence of CR in an analytical game-theoretical model and empirically in mixed human-driven traffic. In this paper, we demonstrate that CR can be attained among driving agents trained using deep reinforcement learning (DRL) with a simple reward design. We examine the extent to which self-interested traffic agents can achieve CR without directly incorporating system-level objectives. Results show that CR consistently emerges in various scenarios, which indicates the robustness of this property. We also postulate a mechanism to explain the emergence of CR in the microscopic and dynamic environment and verify it based on simulation evidence. This research suggests the possibility of leveraging advanced learning methods (such as federated learning) to achieve collective cooperation among self-interested driving agents in mixed-autonomy systems.
- North America > United States > California > Yolo County > Davis (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Washington (0.04)
- Research Report > New Finding (0.86)
- Research Report > Experimental Study (0.66)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology (0.67)
Do GNN-based QEC Decoders Require Classical Knowledge? Evaluating the Efficacy of Knowledge Distillation from MWPM
Quantum computers hold the potential to outperform classical computers on specific computational problems, but their realization is hindered by the fragility of qubits due to decoherence. Quantum Error Correction (QEC) is an essential technology to overcome this challenge, enabling the detection and correction of errors by redundantly encoding a single logical qubit into multiple physical qubits. The performance of QEC is critically dependent on the classical "decoder" algorithm, which interprets the error syndrome to deduce the appropriate correction operation. The standard decoder for the surface code, Minimum-Weight Perfect Matching (MWPM) [1], performs well under a simplified noise model where errors are assumed to be independent and identically distributed (i.i.d.). However, noise in real quantum devices exhibits complex spatio-temporal correlations, and the discrepancy between the theoretical model and reality can degrade the decoder's performance. To address this, decoders based on machine learning, such as Graph Neural Networks (GNNs), have emerged as a promising alternative [2, 3]. GNNs have the ability to learn error patterns directly from data. It is generally anticipated that injecting physical prior knowledge into a GNN should improve its performance. Specifically, "knowledge distillation" [4], which transfers the knowledge of theoretical error structures from MWPM to a GNN, is considered a concrete method to realize this hypothesis.
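To make the decoder's role in the abstract above concrete, here is a minimal sketch of syndrome decoding. It assumes a classical three-bit repetition code with bit-flip errors only, which is a deliberately simplified stand-in: it is not the surface code, MWPM, or the GNN decoder the paper studies. The names `encode`, `syndrome`, `decode`, and `CORRECTION` are hypothetical and chosen for illustration.

```python
def encode(bit):
    """Redundantly encode one logical bit into three physical bits."""
    return [bit, bit, bit]


def syndrome(word):
    """Parity checks on neighbouring pairs; (0, 0) means no error detected."""
    return (word[0] ^ word[1], word[1] ^ word[2])


# Lookup-table decoder: maps each syndrome to the index of the bit to flip.
# This table is only correct under the assumption of at most one bit flip.
CORRECTION = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}


def decode(received):
    """Apply the correction suggested by the syndrome, then read the logical bit."""
    fixed = list(received)
    flip = CORRECTION[syndrome(fixed)]
    if flip is not None:
        fixed[flip] ^= 1
    return fixed[0]


if __name__ == "__main__":
    word = encode(1)
    word[2] ^= 1                      # inject a single bit-flip error
    print(syndrome(word), decode(word))
    # Two errors exceed what this code can correct: the decoder "fixes"
    # the wrong bit and returns the wrong logical value.
    print(decode([1, 1, 0]))          # encoded 0 with two flips
```

The lookup table plays the part of the decoder: it encodes an assumed noise model (single, independent flips). When the real noise departs from that assumption, as in the two-flip case, the decoder fails, which is the kind of model mismatch that motivates learned decoders.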
Modelling Chemical Reaction Networks using Neural Ordinary Differential Equations
Thöni, Anna C. M., Robinson, William E., Bachrach, Yoram, Huck, Wilhelm T. S., Kachman, Tal
In chemical reaction network theory, ordinary differential equations are used to model the temporal change of chemical species concentration. As the functional form of these ordinary differential equation systems is derived from an empirical model of the reaction network, it may be incomplete. Our approach aims to uncover the parts of the reaction network that such a model misses by combining dynamic modelling with deep learning in the form of neural ordinary differential equations. Our contributions not only help to identify the shortcomings of existing empirical models but also assist the design of future reaction networks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
- North America > United States > New York > New York County > New York City (0.04)
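As an illustration of the kind of ODE model the reaction-network abstract above refers to, the sketch below integrates mass-action kinetics for a hypothetical two-step network A -> B -> C. The network, the rate constants, and the function names are assumptions made for this example and are not taken from the paper. In a neural ODE approach, the hand-written right-hand side `rhs` would be replaced or augmented by a trained neural network; that part is not implemented here.

```python
def rhs(state, k1, k2):
    """Mass-action kinetics for A -> B -> C with rate constants k1 and k2."""
    a, b, c = state
    return (-k1 * a, k1 * a - k2 * b, k2 * b)


def rk4_step(f, state, dt, *params):
    """One classical fourth-order Runge-Kutta step for d(state)/dt = f(state)."""
    s1 = f(state, *params)
    s2 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, s1)), *params)
    s3 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, s2)), *params)
    s4 = f(tuple(x + dt * k for x, k in zip(state, s3)), *params)
    return tuple(
        x + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
        for x, p, q, r, s in zip(state, s1, s2, s3, s4)
    )


def simulate(state, k1, k2, dt, steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(rhs, state, dt, k1, k2)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Start with pure A and integrate to t = 1 (illustrative values only).
    traj = simulate((1.0, 0.0, 0.0), k1=1.0, k2=0.5, dt=0.01, steps=100)
    a, b, c = traj[-1]
    print(f"A={a:.4f} B={b:.4f} C={c:.4f} total={a + b + c:.4f}")
```

Because every reaction only converts one species into another, the total concentration A + B + C stays constant, which gives a quick sanity check on the integrator.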
How Strategic Agents Respond: Comparing Analytical Models with LLM-Generated Responses in Strategic Classification
Xie, Tian, Rauch, Pavan, Zhang, Xueru
When machine learning (ML) algorithms are used to automate human-related decisions, human agents may gain knowledge of the decision policy and behave strategically to obtain desirable outcomes. Strategic Classification (SC) has been proposed to address the interplay between agents and decision-makers. Prior work on SC has relied on assumptions that agents are perfectly or approximately rational, responding to decision policies by maximizing their utilities. Verifying these assumptions is challenging due to the difficulty of collecting real-world agent responses. Meanwhile, the growing adoption of large language models (LLMs) makes it increasingly likely that human agents in SC settings will seek advice from these tools. We propose using strategic advice generated by LLMs to simulate human agent responses in SC. Specifically, we examine five critical SC scenarios -- hiring, loan applications, school admissions, personal income, and public assistance programs -- and simulate how human agents with diverse profiles seek advice from LLMs. We then compare the resulting agent responses with the best responses generated by existing theoretical models. Our findings reveal that: (i) LLMs and theoretical models generally lead to agent score or qualification changes in the same direction across most settings, with both achieving similar levels of fairness; (ii) state-of-the-art commercial LLMs (e.g., GPT-3.5, GPT-4) consistently provide helpful suggestions, though these suggestions typically do not result in maximal score or qualification improvements; and (iii) LLMs tend to produce more diverse agent responses, often favoring more balanced effort allocation strategies. These results suggest that theoretical models align with LLMs to some extent and that leveraging LLMs to simulate more realistic agent responses offers a promising approach to designing trustworthy ML systems.
- Asia > Middle East > Israel > Southern District > Eilat (0.04)
- North America > United States > Ohio (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Education > Educational Setting > Higher Education (0.49)
- Banking & Finance > Loans (0.48)
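The "best responses generated by existing theoretical models" in the strategic classification abstract above can be illustrated with one common textbook formulation. The sketch assumes a linear decision rule (accept when w·x >= b), a quadratic manipulation cost, and a fixed benefit for being accepted. These modelling choices, and the function name `best_response`, are assumptions for illustration and are not necessarily the models the paper compares against.

```python
def best_response(x, w, b, cost_scale, benefit=1.0):
    """Best response of a rational agent to the rule: accept if w.x >= b.

    The agent pays cost_scale * ||x_new - x||^2 to move its features and
    gains `benefit` if accepted. The cheapest way to reach the boundary is
    to move along w, so the agent either makes that move or stays put.
    """
    score = sum(wi * xi for wi, xi in zip(w, x))
    if score >= b:
        return list(x)                      # already accepted, no effort needed
    norm_sq = sum(wi * wi for wi in w)
    gap = b - score
    cost = cost_scale * gap * gap / norm_sq  # cost of the minimal move
    if cost > benefit:
        return list(x)                      # manipulation is not worth it
    step = gap / norm_sq
    # Note: with arbitrary floats the new score can land a rounding error
    # below b; a real implementation would add a small tolerance.
    return [xi + step * wi for xi, wi in zip(x, w)]


if __name__ == "__main__":
    w, b = [1.0, 1.0], 1.0
    print(best_response([0.0, 0.0], w, b, cost_scale=1.0))  # cheap: agent moves
    print(best_response([0.0, 0.0], w, b, cost_scale=3.0))  # costly: agent stays
```

In this model a rational agent spreads effort across features in proportion to the weights w. The abstract's observation that LLM advice leads to more diverse and more balanced effort allocations is a statement about how simulated agents deviate from this kind of single optimal move.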
BayesNAM: Leveraging Inconsistency for Reliable Explanations
Kim, Hoki, Park, Jinseong, Choi, Yujin, Lee, Seungyun, Lee, Jaewook
Neural additive model (NAM) is a recently proposed explainable artificial intelligence (XAI) method that utilizes neural network-based architectures. Given the advantages of neural networks, NAMs provide intuitive explanations for their predictions with high model performance. In this paper, we analyze a critical yet overlooked phenomenon: NAMs often produce inconsistent explanations, even when using the same architecture and dataset. Traditionally, such inconsistencies have been viewed as issues to be resolved. However, we argue instead that these inconsistencies can provide valuable explanations within the given data model. Through a simple theoretical framework, we demonstrate that these inconsistencies are not mere artifacts but emerge naturally in datasets with multiple important features. To effectively leverage this information, we introduce a novel framework, Bayesian Neural Additive Model (BayesNAM), which integrates Bayesian neural networks and feature dropout, with theoretical proof demonstrating that feature dropout effectively captures model inconsistencies. Our experiments demonstrate that BayesNAM effectively reveals potential problems such as insufficient data or structural limitations of the model, providing more reliable explanations and potential remedies.
- North America > United States > California > San Francisco County > San Francisco (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Europe > Italy > Sardinia > Cagliari (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
Dark energy reconstruction analysis with artificial neural networks: Application on simulated Supernova Ia data from Rubin Observatory
Mitra, Ayan, Gómez-Vargas, Isidro, Zarikas, Vasilios
In this paper, we present an analysis of Supernova Ia (SNIa) distance moduli $\mu(z)$ and dark energy using an Artificial Neural Network (ANN) reconstruction based on LSST simulated three-year SNIa data. The ANNs employed in this study utilize genetic algorithms for hyperparameter tuning and Monte Carlo Dropout for predictions. Our ANN reconstruction architecture is capable of modeling both the distance moduli and their associated statistical errors given redshift values. We compare the performance of the ANN-based reconstruction with two theoretical dark energy models: $\Lambda$CDM and Chevallier-Linder-Polarski (CPL). Bayesian analysis is conducted for these theoretical models using the LSST simulations and compared with observations from Pantheon and Pantheon+ SNIa real data. We demonstrate that our model-independent ANN reconstruction is consistent with both theoretical models. Performance metrics and statistical tests reveal that the ANN produces distance modulus estimates that align well with the LSST dataset and exhibit only minor discrepancies with $\Lambda$CDM and CPL.
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- South America > Chile (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
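For readers unfamiliar with the two theoretical models named in the dark energy abstract above, the sketch below computes the distance modulus $\mu(z)$ for a spatially flat universe under the CPL parameterisation $w(a) = w_0 + w_a(1 - a)$, which reduces to $\Lambda$CDM for $w_0 = -1$, $w_a = 0$. Flatness, the neglect of radiation, the default parameter values, and the function names are assumptions of this example; it is not the paper's analysis pipeline and does not involve the ANN reconstruction.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def e_of_z(z, omega_m, w0=-1.0, wa=0.0):
    """Dimensionless Hubble rate H(z)/H0 for flat CPL dark energy."""
    de_density = (1.0 + z) ** (3.0 * (1.0 + w0 + wa)) * math.exp(
        -3.0 * wa * z / (1.0 + z)
    )
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m) * de_density)


def distance_modulus(z, h0=70.0, omega_m=0.3, w0=-1.0, wa=0.0, n=1000):
    """mu(z) = 5 log10(d_L / Mpc) + 25, with d_L = (1 + z) * comoving distance.

    The comoving distance integral is evaluated with Simpson's rule,
    so n must be even. h0 is in km/s/Mpc.
    """
    h = z / n
    total = 1.0 / e_of_z(0.0, omega_m, w0, wa) + 1.0 / e_of_z(z, omega_m, w0, wa)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight / e_of_z(i * h, omega_m, w0, wa)
    d_comoving = C_KM_S / h0 * total * h / 3.0   # in Mpc
    d_lum = (1.0 + z) * d_comoving
    return 5.0 * math.log10(d_lum) + 25.0


if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0):
        lcdm = distance_modulus(z)
        cpl = distance_modulus(z, w0=-0.9, wa=0.1)
        print(f"z={z:.1f}  LCDM mu={lcdm:.3f}  CPL mu={cpl:.3f}")
```

A model-independent reconstruction, as described in the abstract, would instead learn $\mu(z)$ directly from the supernova data, and curves like the ones printed here would serve only as the theoretical baselines for comparison.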
Breaking Neural Network Scaling Laws with Modularity
Boopathy, Akhilan, Jiang, Sunshine, Yue, William, Hwang, Jaedong, Iyer, Abhiram, Fiete, Ila
Modular neural networks outperform nonmodular neural networks on tasks ranging from visual question answering to robotics. These performance improvements are thought to be due to modular networks' superior ability to model the compositional and combinatorial structure of real-world problems. However, a theoretical explanation of how modularity improves generalizability, and of how to leverage task modularity while training networks, remains elusive. Using recent theoretical progress in explaining neural network generalization, we investigate how the amount of training data required to generalize on a task varies with the intrinsic dimensionality of a task's input. We show theoretically that when applied to modularly structured tasks, while nonmodular networks require a number of samples exponential in task dimensionality, modular networks' sample complexity is independent of task dimensionality: modular networks can generalize in high dimensions. We then develop a novel learning rule for modular networks to exploit this advantage and empirically show the improved generalization of the rule, both in- and out-of-distribution, on high-dimensional, modular tasks.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Locomotion Dynamics of an Underactuated Three-Link Robotic Vehicle
The wheeled three-link snake robot is a well-known example of an underactuated system modelled using nonholonomic constraints that prevent lateral slippage (skid) of the wheels. A kinematically controlled configuration assumes that both joint angles are directly prescribed as phase-shifted periodic inputs. In another configuration of the robot, only one joint is periodically actuated while the second joint is passively governed by a visco-elastic torsion spring. In our work, we constructed the two configurations of the wheeled robot and conducted motion experiments under different actuation inputs. Analysis of the motion tracking measurements reveals a significant amount of wheel skid, in contrast to the assumptions used in standard nonholonomic models. Therefore, we propose modified dynamic models which include wheel skid and viscous friction forces, as well as rolling resistance. After parameter fitting, these dynamic models reach good agreement with the motion measurements, including the effects of input frequency on the mean speed and net displacement per period. This illustrates the importance of incorporating wheel skid and friction into the system's model.
- North America > United States > Michigan (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
Artificial Agency and Large Language Models
van Lier, Maud, Muñoz-Gil, Gorka
The arrival of Large Language Models (LLMs) has stirred up philosophical debates about the possibility of realizing agency in an artificial manner. In this work we contribute to the debate by presenting a theoretical model that can be used as a threshold conception for artificial agents. The model defines agents as systems whose actions and goals are always influenced by a dynamic framework of factors that consists of the agent's accessible history, its adaptive repertoire and its external environment. This framework, in turn, is influenced by the actions that the agent takes and the goals that it forms. We show with the help of the model that state-of-the-art LLMs are not agents yet, but that there are elements to them that suggest a way forward. The paper argues that a combination of the agent architecture presented in Park et al. (2023) together with the use of modules like the Coscientist in Boiko et al. (2023) could potentially be a way to realize agency in an artificial manner. We end the paper by reflecting on the obstacles one might face in building such an artificial agent and by presenting possible directions for future research.
- Europe > Austria > Tyrol > Innsbruck (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany (0.04)
Bayesian Co-navigation: Dynamic Designing of the Materials Digital Twins via Active Learning
Slautin, Boris N., Liu, Yongtao, Funakubo, Hiroshi, Vasudevan, Rama K., Ziatdinov, Maxim A., Kalinin, Sergei V.
Scientific advancement is universally based on the dynamic interplay between theoretical insights, modelling, and experimental discoveries. However, this feedback loop is often slow, including delayed community interactions and the gradual integration of experimental data into theoretical frameworks. This challenge is particularly exacerbated in domains dealing with high-dimensional object spaces, such as molecules and complex microstructures. Hence, the integration of theory within automated and autonomous experimental setups, or theory-in-the-loop automated experiment, is emerging as a crucial objective for accelerating scientific research. The critical aspect is not only to use theory but also to update it on the fly during the experiment. Here, we introduce a method for integrating theory into the loop through Bayesian co-navigation of theoretical model space and experimentation. Our approach leverages the concurrent development of surrogate models for both simulation and experimental domains at the rates determined by latencies and costs of experiments and computation, alongside the adjustment of control parameters within theoretical models to minimize epistemic uncertainty over the experimental object spaces. This methodology facilitates the creation of digital twins of material structures, encompassing both the surrogate model of behavior that includes the correlative part and the theoretical model itself. While demonstrated here within the context of functional responses in ferroelectric materials, our approach holds promise for broader applications, including the exploration of optical properties in nanoclusters, microstructure-dependent properties in complex materials, and properties of molecular systems. The analysis code that supports the findings is publicly available at https://github.com/Slautin/2024_Co-navigation/tree/main
- North America > United States > Tennessee > Knox County > Knoxville (0.14)
- North America > United States > Washington > Benton County > Richland (0.04)
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)